fix(mcp): use tiktoken for response-size-guard token estimation #39912
Conversation
The MCP response-size-guard middleware estimates token counts to
decide when to truncate or block oversized tool responses. The
existing estimator used a simple char-to-token heuristic
(CHARS_PER_TOKEN = 3.5) that miscounts JSON-heavy MCP responses
relative to Claude's actual tokenizer:
For an 80KB get_dataset_info response, estimate_token_count gives:
- 3.5 chars/token heuristic: ~22,800 tokens
- tiktoken cl100k_base: varies by content, but far closer to the truth
This let payloads slip past the 25k token limit while still being
truncated by Claude Agent SDK's own threshold — the SDK saved them
into a file the model couldn't read back, causing 120s timeouts in
the eval suite.
Switch the estimator to tiktoken's cl100k_base encoding (a real BPE
tokenizer with a vocabulary similar to Claude's; tracks Claude's
counts within roughly ±10% for English and JSON-heavy content). The
char-based heuristic stays as a fallback for environments where
tiktoken is not installed; its ratio drops from 3.5 to 3.0 chars per
token to be more conservative for JSON content.
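A minimal sketch of the estimator this describes, assuming the names estimate_token_count, CHARS_PER_TOKEN, and _ENCODING from this PR (the exact code in superset/mcp_service/utils/token_utils.py may differ):

```python
# Sketch only — mirrors the behavior described in this PR, not necessarily
# the exact implementation in superset/mcp_service/utils/token_utils.py.
try:
    import tiktoken

    _ENCODING = tiktoken.get_encoding("cl100k_base")
except ImportError:
    _ENCODING = None  # fallback path: tiktoken not installed

CHARS_PER_TOKEN = 3.0  # conservative ratio for JSON-heavy content (was 3.5)


def estimate_token_count(text: str) -> int:
    if _ENCODING is not None:
        try:
            return len(_ENCODING.encode(text))
        except Exception:
            # Defensive fallback: the size guard must never fail-open.
            pass
    return int(len(text) / CHARS_PER_TOKEN)
```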
tiktoken is added as a dependency of the existing ``fastmcp`` extra,
so anyone installing ``apache-superset[fastmcp]`` gets it
automatically. requirements/base.txt and requirements/development.txt
are regenerated via scripts/uv-pip-compile.sh.
Tests:
- New unit tests cover the tiktoken-loaded path, the unavailable-
fallback path, and a defensive fallback when tiktoken's encode
raises (the size guard must never fail-open).
- One existing middleware test that depended on the old 3.5
chars/token heuristic is now decoupled by mocking
estimate_response_tokens directly.
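The decoupled test pattern looks roughly like this (hypothetical test body; the real assertions live in the middleware test module):

```python
# Hypothetical sketch: patch the estimator where the middleware uses it,
# instead of depending on any particular chars-per-token ratio.
from unittest import mock


def test_guard_blocks_oversized_response():
    with mock.patch(
        "superset.mcp_service.middleware.estimate_response_tokens",
        return_value=30_000,  # force an estimate above the 25k limit
    ):
        ...  # invoke the middleware and assert it truncates or blocks
```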
Codecov Report
```diff
@@            Coverage Diff             @@
##           master   #39912      +/-   ##
==========================================
- Coverage   63.88%   63.88%   -0.01%
==========================================
  Files        2583     2583
  Lines      136602   136617      +15
  Branches    31501    31502       +1
==========================================
  Hits        87274    87274
- Misses      47812    47827      +15
  Partials     1516     1516
```
Address review feedback on apache#39912: the lazy imports of estimate_response_tokens, format_size_limit_error, INFO_TOOLS, and truncate_oversized_response inside ResponseSizeGuardMiddleware methods made the conventional patch path (superset.mcp_service.middleware.estimate_response_tokens) raise AttributeError because those names didn't exist on the middleware module. Moving the imports to module level makes patching at "where it's used" work as expected, which is the standard mock convention. Tests now patch superset.mcp_service.middleware.estimate_response_tokens directly rather than the upstream definition module.
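A self-contained illustration of that pitfall, using throwaway in-memory modules as stand-ins for the real ones (mock.patch refuses to patch an attribute that does not exist on the target module):

```python
# Demo of the pitfall: "utils" stands in for token_utils, and
# "mw_lazy"/"mw" stand in for the middleware module.
import sys
import types
from unittest import mock

utils = types.ModuleType("utils")
utils.estimate_response_tokens = lambda text: len(text) // 3
sys.modules["utils"] = utils

# Lazy-import style: the middleware module never binds the name itself.
mw_lazy = types.ModuleType("mw_lazy")
exec(
    "def guard(text):\n"
    "    from utils import estimate_response_tokens\n"
    "    return estimate_response_tokens(text)\n",
    mw_lazy.__dict__,
)
sys.modules["mw_lazy"] = mw_lazy

try:
    with mock.patch("mw_lazy.estimate_response_tokens", return_value=0):
        pass
except AttributeError as exc:
    print(f"lazy import: {exc}")  # module has no such attribute

# Module-level import: the name exists on the module, so patching works.
mw = types.ModuleType("mw")
exec(
    "from utils import estimate_response_tokens\n"
    "def guard(text):\n"
    "    return estimate_response_tokens(text)\n",
    mw.__dict__,
)
sys.modules["mw"] = mw

with mock.patch("mw.estimate_response_tokens", return_value=42):
    print(mw.guard("anything"))  # -> 42
```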
Addressed in f20e95b. Root cause: the middleware was lazy-importing estimate_response_tokens, format_size_limit_error, INFO_TOOLS, and truncate_oversized_response inside its methods. Fix: hoisted all four imports to module level, so patching superset.mcp_service.middleware.estimate_response_tokens now works as expected. Hoisting the other three names wasn't strictly needed but is the right call — it keeps tests robust against future churn and gets rid of three pointless local imports.
SUMMARY
The MCP response-size-guard middleware (ResponseSizeGuardMiddleware) estimates token counts to decide when to truncate or block oversized tool responses. The existing estimator at superset/mcp_service/utils/token_utils.py used a simple char-to-token heuristic (CHARS_PER_TOKEN = 3.5) that miscounts JSON-heavy MCP responses relative to Claude's actual tokenizer. Specific responses could slip past the configured token limit while still being truncated by the Claude Agent SDK's own threshold — the SDK then saved them into a file the model could not read back, causing 120s timeouts in tool calls like get_dataset_info for wide datasets.

This PR switches the estimator to tiktoken's cl100k_base encoding — a real BPE tokenizer with a vocabulary similar to Claude's. For English and JSON-heavy content it tracks Claude's counts within roughly ±10%, which is far closer than any character-ratio heuristic.

The previous heuristic stays as a graceful fallback for environments where tiktoken is not installed; its ratio drops from 3.5 → 3.0 chars/token to be more conservative for JSON content (which it under-counted before).
BEFORE/AFTER
TESTING INSTRUCTIONS
New unit tests cover:
- the tiktoken-loaded path
- the fallback path when _ENCODING is None (tiktoken not installed), which uses len / CHARS_PER_TOKEN
- a defensive fallback when encode raises — the size guard must never fail-open

ADDITIONAL INFORMATION
New dependency: tiktoken>=0.7.0,<1.0 added to the fastmcp extra in pyproject.toml. Anyone installing apache-superset[fastmcp] gets it automatically. requirements/base.txt and requirements/development.txt regenerated via scripts/uv-pip-compile.sh.

No network calls: tiktoken is pure offline tokenization. Anthropic's count_tokens API is more accurate but adds a network roundtrip per tool result, which is too expensive for synchronous middleware.

Behavioral change: previously-passing token estimates for the same content will now report different (more accurate) numbers. Sites relying on a specific cap will see different effective behavior — typically slightly more conservative truncation for English-text-heavy responses, slightly less for highly repetitive content (BPE compresses repetition).
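For a rough sense of how the two estimators diverge on this kind of payload, a quick illustrative check (the payload shape is synthetic, not from the repo; requires tiktoken):

```python
# Compare tiktoken's count against the 3.0 chars/token fallback on a
# synthetic JSON payload resembling a wide get_dataset_info response.
import json

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
payload = json.dumps(
    {"columns": [{"name": f"col_{i}", "type": "VARCHAR"} for i in range(500)]}
)

print("chars:            ", len(payload))
print("tiktoken tokens:  ", len(enc.encode(payload)))
print("heuristic tokens: ", int(len(payload) / 3.0))
```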
Has associated issue:
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
Introduces new feature or API
Removes existing feature or API